
    Temporal error concealment for fisheye video sequences based on equisolid re-projection

    Wide-angle video sequences obtained by fisheye cameras exhibit characteristics that do not comply well with standard image and video processing techniques such as error concealment. This paper introduces a temporal error concealment technique designed for the inherent characteristics of equisolid fisheye video sequences: part of the error concealment is conducted in the perspective domain, followed by a re-projection into the equisolid domain. Combining this technique with conventional decoder motion vector estimation achieves average gains of 0.71 dB over pure decoder motion vector estimation on the test sequences used. Maximum gains reach up to 2.04 dB for selected frames.
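
    For reference, the two projection models involved are standard: an equisolid fisheye maps an incident angle theta to the image radius r = 2f*sin(theta/2), while a perspective image uses r = f*tan(theta). Below is a minimal sketch of the radial mappings between the two domains (f denotes the focal length; this is the textbook model, not the paper's full concealment pipeline):

```python
import numpy as np

def equisolid_to_perspective_radius(r_eq, f):
    """Map a radial distance in an equisolid fisheye image (r = 2f sin(theta/2))
    to the corresponding radius in a perspective image (r = f tan(theta))."""
    theta = 2.0 * np.arcsin(np.clip(r_eq / (2.0 * f), -1.0, 1.0))  # incident angle
    return f * np.tan(theta)

def perspective_to_equisolid_radius(r_persp, f):
    """Inverse mapping: perspective radius back into the equisolid domain."""
    theta = np.arctan(r_persp / f)
    return 2.0 * f * np.sin(theta / 2.0)
```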

    Learning to Predict Image-based Rendering Artifacts with Respect to a Hidden Reference Image

    Image metrics predict the perceived per-pixel difference between a reference image and its degraded (e.g., re-rendered) version. In several important applications, the reference image is not available and image metrics cannot be applied. We devise a neural network architecture and training procedure that predicts the MSE, SSIM, or VGG16 image difference from the distorted image alone, without observing the reference. This is enabled by two insights. The first is to inject sufficiently many undistorted natural image patches, which can be found in arbitrary amounts and are known to have no perceivable difference to themselves; this avoids false positives. The second is to balance the learning, carefully ensuring that all image errors are equally likely; this avoids false negatives. Surprisingly, we observe that the resulting no-reference metric can subjectively perform even better than the reference-based one, as it had to become robust against misalignments. We evaluate the effectiveness of our approach in an image-based rendering context, both quantitatively and qualitatively. Finally, we demonstrate two applications which reduce light-field capture time and provide guidance for interactive depth adjustment.
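
    A minimal sketch of how the two training insights could be realized in a data pipeline; the function and parameter names here are hypothetical illustrations, not taken from the paper:

```python
import torch

def make_training_batch(distorted, targets, natural_patches, frac_clean=0.5):
    """Hypothetical batch builder: mix degraded patches (with precomputed
    MSE/SSIM/VGG16 targets) with pristine natural patches whose ground-truth
    error is exactly zero, so the net also sees 'no difference' examples."""
    n_clean = int(frac_clean * natural_patches.shape[0])
    clean = natural_patches[:n_clean]
    zero = torch.zeros(n_clean)            # a patch is identical to itself
    x = torch.cat([distorted, clean], dim=0)
    y = torch.cat([targets, zero], dim=0)
    perm = torch.randperm(x.shape[0])      # shuffle clean and degraded samples
    return x[perm], y[perm]
```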

    Pixel-Wise Confidences for Stereo Disparities Using Recurrent Neural Networks

    One of the inherent problems with stereo disparity estimation algorithms is the lack of reliability information for the computed disparities. As a consequence, errors from the initial disparity maps are propagated to subsequent processing steps such as view rendering. Confidence measures are among the most popular remedies because of their capability to detect disparity outliers. Recently, confidence measures based on convolutional neural networks achieved the best results by directly processing initial disparity maps. In contrast to these existing methods, we propose a novel recurrent neural network architecture to compute confidences for different stereo matching algorithms. To keep the complexity low, the confidence for a given pixel is computed purely from its associated matching costs, without considering any neighbouring pixels. Compared to state-of-the-art confidence prediction methods leveraging convolutional neural networks, the proposed network is simpler and smaller (reducing the number of trainable parameters by roughly three to four orders of magnitude). Moreover, experimental results on three well-known datasets and with two popular stereo algorithms clearly show that the proposed approach outperforms state-of-the-art confidence estimation techniques.
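
    A toy sketch of such a per-pixel recurrent confidence predictor, reading one cost per disparity hypothesis; layer types and sizes are illustrative assumptions, not the paper's exact architecture:

```python
import torch
import torch.nn as nn

class CostCurveConfidence(nn.Module):
    """Per-pixel confidence: an LSTM reads a pixel's matching-cost curve
    (one cost per disparity hypothesis) and outputs a confidence in [0, 1].
    No neighbouring pixels are used, keeping the model very small."""
    def __init__(self, hidden=32):
        super().__init__()
        self.rnn = nn.LSTM(input_size=1, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, costs):                 # costs: (batch, num_disparities)
        seq = costs.unsqueeze(-1)              # treat the cost curve as a sequence
        _, (h, _) = self.rnn(seq)
        return torch.sigmoid(self.head(h[-1]))  # confidence per pixel
```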

    Non-planar inside-out dense light-field dataset and reconstruction pipeline

    Light-field imaging provides full spatio-angular information of the real world by capturing light rays in various directions. This enables image processing algorithms that deliver immersive user experiences such as VR. To evaluate and develop reconstruction algorithms, a precise and dense light-field dataset of the real world that can serve as ground truth is desirable. Fraunhofer IIS presents a dataset of two scenes captured by an accurate industrial robot with an attached color camera looking outward. The arm moves on a cylindrical path covering a field of view of 125 degrees with an angular step size of 0.01 degrees. The images are pre-processed in several steps. The disparity between two adjacent views at a resolution of 5168x3448 is less than 1.6 pixels; the parallax between the foreground and the background objects is less than 0.6 pixels. The dataset is based on the paper "Non-planar inside-out dense light-field dataset and reconstruction pipeline" by Faezeh Sadat Zakeri, Ahmed Durmush, Matthias Ziegler, Michel Bätz, and Joachim Keinert.
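
    As a sanity check on the capture geometry, the stated field of view and step size imply roughly 12,501 views per scene (assuming both endpoints of the arc are captured):

```python
# Rough view count implied by the stated capture geometry:
fov_deg = 125.0    # angular range of the cylindrical path
step_deg = 0.01    # angular step between adjacent views
num_views = round(fov_deg / step_deg) + 1  # fence-post count, endpoints included
print(num_views)   # -> 12501 views per scene
```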

    TEDDY: A High-Resolution High Dynamic Range Light-Field Dataset

    Light-field (LF) imaging has various advantages over traditional 2D photography, providing angular information of the real-world scene by separately recording light rays in different directions. Despite the directional light information, which enables new capabilities such as depth estimation, post-capture refocusing, and 3D modelling, currently available light-field datasets are very restricted in terms of spatial resolution and dynamic range. We address this problem by capturing a novel light-field dataset featuring both a high spatial resolution and a high dynamic range (HDR). This dataset should enable the community to research and develop efficient reconstruction and tone-mapping algorithms for a hyper-realistic visual experience. The dataset consists of four static light-fields captured by a high-quality digital camera mounted on two precise linear axes, using exposure bracketing at each view point.
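
    For context, a minimal sketch of a standard weighted exposure-bracket merge (assuming linearized images in [0, 1]; a generic formulation, not the dataset's actual processing pipeline):

```python
import numpy as np

def merge_exposures(images, exposure_times):
    """Generic HDR merge for an exposure bracket: each pixel's radiance is a
    weighted average over exposures, with a triangle weight that down-weights
    under- and over-exposed values. Assumes linear images in [0, 1]."""
    images = np.asarray(images, dtype=np.float64)    # (N, H, W) or (N, H, W, C)
    w = 1.0 - np.abs(2.0 * images - 1.0)             # triangle weighting
    t = np.asarray(exposure_times, dtype=np.float64)
    t = t.reshape(-1, *([1] * (images.ndim - 1)))    # broadcast over pixels
    return (w * images / t).sum(axis=0) / np.maximum(w.sum(axis=0), 1e-8)
```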

    A Novel Confidence Measure for Disparity Maps by Pixel-Wise Cost Function Analysis

    Disparity estimation algorithms mostly lack information about the reliability of the computed disparities. Therefore, errors in initial disparity maps are propagated to consecutive processing steps. This is particularly problematic for difficult scene elements, e.g., periodic structures. Consequently, we introduce a simple yet novel confidence measure that filters out wrongly computed disparities, resulting in improved final disparity maps. To demonstrate the benefit of this approach, we compare our method with existing state-of-the-art confidence measures and show that we improve the ability to detect false disparities by 54.2%.
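
    The abstract does not name the cost statistic it analyses; as one classic example of pixel-wise cost-curve analysis, the peak-ratio measure rates a pixel by how clearly its best cost beats the runner-up (a generic illustration, not the proposed measure):

```python
import numpy as np

def peak_ratio_confidence(cost_volume):
    """Classic peak-ratio confidence over a matching-cost volume of shape
    (H, W, D), lower cost is better, costs assumed non-negative. Confidence
    is high when the best cost is clearly separated from the second-best."""
    sorted_costs = np.sort(cost_volume, axis=-1)
    c1, c2 = sorted_costs[..., 0], sorted_costs[..., 1]
    return 1.0 - c1 / np.maximum(c2, 1e-8)   # ~0 for ambiguous pixels
```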

    Light-field view synthesis using convolutional block attention module

    Consumer light-field (LF) cameras suffer from low or limited resolution because of the angular-spatial trade-off. To alleviate this drawback, we propose a novel learning-based approach utilizing an attention mechanism to synthesize novel views of a light-field image from a sparse set of input views (i.e., the 4 corner views) of a camera array. The proposed method divides the process into three stages: stereo-feature extraction, disparity estimation, and final image refinement, with a sequential convolutional neural network for each stage. A residual convolutional block attention module (CBAM) is employed for the final adaptive image refinement. Attention modules help the network learn and focus on the important features of the image and are applied sequentially along the channel and spatial dimensions. Experimental results show the robustness of the proposed method: the proposed network outperforms state-of-the-art learning-based light-field view synthesis methods on two challenging real-world datasets by 0.5 dB on average. Furthermore, we provide an ablation study to substantiate our findings.
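
    A compact sketch of a residual CBAM block following the original CBAM formulation (Woo et al., 2018); the exact layer sizes used in the paper may differ:

```python
import torch
import torch.nn as nn

class CBAM(nn.Module):
    """Channel attention (shared MLP over avg- and max-pooled descriptors)
    followed by spatial attention (7x7 conv over pooled channel maps)."""
    def __init__(self, channels, reduction=16, kernel_size=7):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels))
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))            # channel attention
        mx = self.mlp(x.amax(dim=(2, 3)))
        x = x * torch.sigmoid(avg + mx).view(b, c, 1, 1)
        s = torch.cat([x.mean(1, keepdim=True),       # spatial attention
                       x.amax(1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.conv(s))

class ResidualCBAM(nn.Module):
    """Residual wrapper around two convolutions plus CBAM, in the spirit of
    the final adaptive refinement stage described in the abstract."""
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1), CBAM(channels))

    def forward(self, x):
        return x + self.body(x)
```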